Goto

Collaborating Authors

 rejection function






We thank the reviewers for acknowledging our contributions and for providing valuable feedback

Neural Information Processing Systems

We thank the reviewers for acknowledging our contributions and for providing valuable feedback. The NVIDIA Titan X (Pascal) is rated at 11.0 TFLOPS, so the latency of The critic architecture also follows the WGAN-GP paper. We clarified the concatenation process in our paper and have added the missing hidden-layer citation. Thank you for pointing this out. The GWIN continues to have a positive impact.


A Unifying Post-Processing Framework for Multi-Objective Learn-to-Defer Problems

arXiv.org Artificial Intelligence

Learn-to-Defer is a paradigm that enables learning algorithms to work not in isolation but as a team with human experts. In this paradigm, we permit the system to defer a subset of its tasks to the expert. Although there are currently systems that follow this paradigm and are designed to optimize the accuracy of the final human-AI team, the general methodology for developing such systems under a set of constraints (e.g., algorithmic fairness, expert intervention budget, defer of anomaly, etc.) remains largely unexplored. In this paper, using a $d$-dimensional generalization to the fundamental lemma of Neyman and Pearson (d-GNP), we obtain the Bayes optimal solution for learn-to-defer systems under various constraints. Furthermore, we design a generalizable algorithm to estimate that solution and apply this algorithm to the COMPAS and ACSIncome datasets. Our algorithm shows improvements in terms of constraint violation over a set of baselines.


Learning When to Say "I Don't Know"

arXiv.org Artificial Intelligence

We propose a new Reject Option Classification technique to identify and remove regions of uncertainty in the decision space for a given neural classifier and dataset. Such existing formulations employ a learned rejection (remove)/selection (keep) function and require either a known cost for rejecting examples or strong constraints on the accuracy or coverage of the selected examples. We consider an alternative formulation by instead analyzing the complementary reject region and employing a validation set to learn per-class softmax thresholds. The goal is to maximize the accuracy of the selected examples subject to a natural randomness allowance on the rejected examples (rejecting more incorrect than correct predictions). We provide results showing the benefits of the proposed method over na\"ively thresholding calibrated/uncalibrated softmax scores with 2-D points, imagery, and text classification datasets using state-of-the-art pretrained models. Source code is available at https://github.com/osu-cvl/learning-idk.


Learning to Reject with a Fixed Predictor: Application to Decontextualization

arXiv.org Artificial Intelligence

Large language models, often trained with billions of parameters, have achieved impressive performance in recent years (Raffel et al., 2019) and are used in a wide variety of natural language generation tasks. However, their output is sometimes undesirable, with hallucinated content (Maynez et al., 2020; Filippova, 2020), and much work remains to fully understand their properties. In many applications, such as healthcare, question-answering systems, or customer service, incorrect predictions are particularly costly and must be avoided. This motivates the design of algorithms for large language models and other NLP tasks that achieve high precision on a large fraction of the input set, while abstaining on the rest. How can we devise such accurate models that allow a reject option?


ATRO: Adversarial Training with a Rejection Option

arXiv.org Machine Learning

This paper proposes a classification framework with a rejection option to mitigate the performance deterioration caused by adversarial examples. While recent machine learning algorithms achieve high prediction performance, they are empirically vulnerable to adversarial examples, which are slightly perturbed data samples that are wrongly classified. In real-world applications, adversarial attacks using such adversarial examples could cause serious problems. To this end, various methods are proposed to obtain a classifier that is robust against adversarial examples. Adversarial training is one of them, which trains a classifier to minimize the worst-case loss under adversarial attacks. In this paper, in order to acquire a more reliable classifier against adversarial attacks, we propose the method of Adversarial Training with a Rejection Option (ATRO). Applying the adversarial training objective to both a classifier and a rejection function simultaneously, classifiers trained by ATRO can choose to abstain from classification when it has insufficient confidence to classify a test data point. We examine the feasibility of the framework using the surrogate maximum hinge loss and establish a generalization bound for linear models. Furthermore, we empirically confirmed the effectiveness of ATRO using various models and real-world datasets.


Generative Well-intentioned Networks

arXiv.org Machine Learning

We propose Generative Well-intentioned Networks (GWINs), a novel framework for increasing the accuracy of certainty-based, closed-world classifiers. A conditional generative network recovers the distribution of observations that the classifier labels correctly with high certainty. We introduce a reject option to the classifier during inference, allowing the classifier to reject an observation instance rather than predict an uncertain label. These rejected observations are translated by the generative network to high-certainty representations, which are then relabeled by the classifier. This architecture allows for any certainty-based classifier or rejection function and is not limited to multilayer perceptrons. The capability of this framework is assessed using benchmark classification datasets and shows that GWINs significantly improve the accuracy of uncertain observations.